Localization: AI translation primitives by jkmassel · Pull Request #25705 · wordpress-mobile/WordPress-iOS

jkmassel · 2026-06-26T04:48:30Z

Reusable Ruby primitives for the AI translation tier of the localization pipeline — the service behind the human ?? AI ?? English floor whose AI stub (ai_translate_plural → nil) was left open in #25688. All of it is pure prompt-building + validation with the Anthropic SDK call injected, so every line of logic is unit-testable without the gem or the network; the live SDK wiring is one thin factory.

Nothing is wired into a lane yet — these are the building blocks, deliberately decoupled from the GlotPress / catalog plumbing that's still in flux.

Summary

TranslationValidator — the format-specifier safety gate. A machine translation must preserve the source's printf/NSString arguments exactly (count + type; positional %1$@ may reorder, which is the whole point). A mismatch is rejected, so a broken translation falls back to English rather than shipping a crash in a locale no one on the team can read.
Glossary — brand do-not-translate list (WordPress, Jetpack, …) plus per-locale preferred terms and a register note, rendered into the prompt. Pure data in; sourcing it (the WordPress.org per-locale glossaries / style guides) is pre-processing handed in later.
AITranslator — three translation shapes: translate (one string), translate_plural (a whole CLDR form-set in one request, so the model keeps one stem across forms), and translate_all (batched regular strings). Structured outputs (output_config json_schema) enforce the reply shape on the plural and batch paths.
AnthropicBatch — the async Message Batches path for a bulk backfill (~50% cheaper): submit → await (poll) → results → collect_batch, plus all the SDK-shape glue, shared with the sync path so the request shape can't drift between them.
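To make the "pure data in, rendered into the prompt" idea concrete, here is a minimal sketch of a glossary object. It is not the PR's Glossary class: the name GlossarySketch, its fields, the prompt wording, and the sample term pair are all assumptions made for illustration.

```ruby
# Hypothetical sketch: field names and prompt wording are assumptions,
# not the PR's Glossary implementation.
GlossarySketch = Struct.new(:do_not_translate, :preferred_terms, :register, keyword_init: true) do
  # Render the glossary as plain prompt lines; empty sections are omitted.
  def to_prompt
    lines = []
    unless do_not_translate.empty?
      lines << "Never translate these brand names: #{do_not_translate.join(', ')}."
    end
    preferred_terms.each do |source, target|
      lines << "Translate \"#{source}\" as \"#{target}\"."
    end
    lines << "Register: #{register}" if register
    lines.join("\n")
  end
end

# Illustrative data only; real per-locale terms would be handed in by a pre-processing step.
glossary = GlossarySketch.new(
  do_not_translate: %w[WordPress Jetpack],
  preferred_terms: { 'post' => 'Beitrag' },
  register: 'informal (du/dein)'
)
puts glossary.to_prompt
```

Keeping the object free of any fetching logic is what lets the sourcing step be swapped in later without touching prompt construction.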

Design notes

The SDK call is injected

AITranslator takes a complete: callable (and the batch path takes a client), so prompt-building, validation, batching, and result-assembly are all exercised by unit tests with a canned reply. AITranslator.with_anthropic / AnthropicBatch.client build the live instances (default model claude-opus-4-8). The two live-API bugs we hit — a custom_id that didn't match ^[a-zA-Z0-9_-]{1,64}$, and results_streaming yielding raw JSONL strings rather than typed objects — were both invisible to a permissive fake client and only caught by real calls. The SDK seams are now live-verified, not just fake-tested.
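The injection pattern described above can be sketched roughly as follows. TranslatorSketch, its prompt text, and the source string are hypothetical; only the idea of a complete: callable and the canned reply come from this PR.

```ruby
# Hypothetical sketch of constructor injection; not the PR's AITranslator.
class TranslatorSketch
  # complete: any object responding to #call(prompt) that returns the model's text.
  def initialize(complete:)
    @complete = complete
  end

  # Build the prompt, delegate the call, and return a trimmed reply or nil.
  def translate(source, locale:, comment: nil)
    prompt = "Translate into #{locale}: #{source}"
    prompt += "\nDeveloper comment: #{comment}" if comment
    reply = @complete.call(prompt).to_s.strip
    reply.empty? ? nil : reply
  end
end

# In a unit test the "SDK" is just a lambda that records the prompt
# and returns a canned reply, so no gem or network is involved.
seen = []
fake = lambda do |prompt|
  seen << prompt
  'Suivre'
end

translator = TranslatorSketch.new(complete: fake)
puts translator.translate('Follow', locale: 'fr', comment: 'Button title (verb).')
```

A fake this permissive accepts anything, which is why it cannot catch request-shape problems such as an invalid custom_id; those still need a real call.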

The placeholder gate is a hard floor

Every machine cell — single string, plural form, or batch entry — passes through TranslationValidator before it's returned. This is the same invariant the catalog needs_review machinery already assumes: the AI tier can only ever produce a safe translation or nil.
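As an illustration of the "safe translation or nil" invariant, here is a simplified placeholder gate. It is a sketch under stated assumptions, not the PR's TranslationValidator: the module name, the regular expression, and the decision to ignore the space flag and `*` widths are all simplifications made here.

```ruby
# Hypothetical sketch of a placeholder gate; the regex covers common
# printf/NSString specifiers only and is an assumption, not the PR's code.
module PlaceholderSketch
  SPEC = /%(?:(\d+)\$)?[-+0#]*\d*(?:\.\d+)?(hh|h|ll|l|q|L|z|t|j)?([@dDiuUxXoOfFeEgGcCsSpaA])/.freeze

  # Sorted [argument_index, type] pairs; non-positional specifiers are
  # numbered in order of appearance, and literal "%%" is ignored.
  def self.signature(string)
    next_index = 0
    pairs = string.gsub('%%', '').scan(SPEC).map do |position, length, conversion|
      next_index += 1
      [(position || next_index).to_i, "#{length}#{conversion}"]
    end
    pairs.sort
  end

  # Same argument count and types; positional arguments may be reordered.
  def self.match?(source, translation)
    signature(source) == signature(translation)
  end

  # The gate: return the translation only if it is non-empty and safe.
  def self.gate(source, translation)
    return nil if translation.nil? || translation.strip.empty?

    match?(source, translation) ? translation : nil
  end
end

source = 'You have %1$d new posts'
p PlaceholderSketch.gate(source, 'Vous avez %1$d nouveaux articles')
p PlaceholderSketch.gate(source, 'Vous avez %1$@ nouveaux articles')
```

Comparing sorted (index, type) pairs is what allows `%2$d ... %1$@` to pass against a source of `%1$@ ... %2$d`, while a changed type or a dropped argument is rejected.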

Plural consistency

Translating each CLDR category independently let the model drift between synonyms across forms (Polish słowo → wyrazy → słów). translate_plural sends the whole form-set in one request and instructs one consistent stem; verified it now yields słowo / słowa / słów.
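One way to picture the single-request form-set is a JSON Schema that requires every plural category of the locale, so one reply carries all forms of a key. This is a sketch only: the category table, the schema layout, and the helper names are assumptions, and the canned reply was written by hand for illustration rather than captured from the API.

```ruby
require 'json'

# Hypothetical sketch: a tiny CLDR cardinal-category table for two locales.
CLDR_CATEGORIES = {
  'en' => %w[one other],
  'pl' => %w[one few many other]
}.freeze

# One JSON Schema object with a required string per plural category,
# so a single reply must contain every form of the key.
def plural_schema(locale)
  categories = CLDR_CATEGORIES.fetch(locale)
  {
    type: 'object',
    properties: categories.to_h { |category| [category, { type: 'string' }] },
    required: categories,
    additionalProperties: false
  }
end

# Parse a reply and keep it only if it has exactly the expected categories.
def parse_forms(reply_json, locale)
  forms = JSON.parse(reply_json)
  return nil unless forms.is_a?(Hash)

  forms.keys.sort == CLDR_CATEGORIES.fetch(locale).sort ? forms : nil
rescue JSON::ParserError
  nil
end

# Hand-written canned reply, for illustration only.
canned = '{"one":"%d słowo","few":"%d słowa","many":"%d słów","other":"%d słowa"}'
puts JSON.pretty_generate(plural_schema('pl'))
p parse_forms(canned, 'pl')
```

Each parsed form would still need to pass the placeholder gate individually before it is returned.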

Verified live

Against the real API (fr/de/ja/pl): verb/adjective disambiguated from the dev comment (Suivre/Suivi, Folgen/Gefolgt), brand terms kept verbatim, German informal register (Dein), French space-before-?, plural stems consistent, and a full Batch round-trip (submit → await → collect) returning Suivre / %1$@ vues mapped back to keys and grouped by locale. Placeholders were preserved throughout — the gate never had to fire on real output.
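The "mapped back to keys and grouped by locale" step can be sketched as below. This is not the PR's AnthropicBatch code: the helper names, the index-based id scheme, the catalog keys, and the assumed shape of each JSONL result line are illustrative assumptions. The custom_id pattern is the one quoted in the design notes.

```ruby
require 'json'

# Pattern the batch API enforces on custom_id (quoted in the design notes).
CUSTOM_ID = /\A[a-zA-Z0-9_-]{1,64}\z/.freeze

# Assign each (locale, key) pair a short safe id and remember the mapping,
# because catalog keys may contain characters the pattern rejects.
def build_id_map(requests)
  pairs = requests.each_with_index.map do |(locale, key), index|
    id = "#{locale}-#{index}"
    raise ArgumentError, "unsafe custom_id: #{id}" unless id.match?(CUSTOM_ID)

    [id, [locale, key]]
  end
  pairs.to_h
end

# Each result line arrives as a raw JSONL string: parse it, keep the
# succeeded entries, and group translations by locale and key.
# The line shape used here is an assumption for this sketch.
def collect(jsonl, id_map)
  jsonl.each_line.with_object(Hash.new { |hash, locale| hash[locale] = {} }) do |line, grouped|
    next if line.strip.empty?

    entry = JSON.parse(line)
    next unless entry.dig('result', 'type') == 'succeeded'

    locale, key = id_map.fetch(entry['custom_id'])
    grouped[locale][key] = entry.dig('result', 'message', 'content', 0, 'text')
  end
end

id_map = build_id_map([['fr', 'reader.follow'], ['fr', 'stats.views']])
jsonl = [
  { custom_id: 'fr-0', result: { type: 'succeeded', message: { content: [{ type: 'text', text: 'Suivre' }] } } },
  { custom_id: 'fr-1', result: { type: 'succeeded', message: { content: [{ type: 'text', text: '%1$@ vues' }] } } }
].map { |row| JSON.generate(row) }.join("\n")
p collect(jsonl, id_map)
```

Keeping the real key in a local lookup table rather than in the id itself sidesteps the character and length limits entirely.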

Not in this PR (deliberate)

Lane wiring — download_localized_plurals still calls the nil stub; switching it to AITranslator#for_plural (and a translate_all pass over the regular catalog) is a separate change.
A .strings / .xcstrings reader — these consume a {key, source, comment} array and plural form-sets; building that from the real catalog (plus any pre-processing) is upstream of here.
Glossary sourcing — Glossary is data-in; pulling the WordPress.org per-locale glossaries / style guides into it comes later.
Bulk-backfill ergonomics — model tiering for cost, and the submit-now / collect-later split for very large batches (vs. the synchronous await).

Test plan

All checks are pure Ruby — stdlib minitest, no bundle, no network:

ruby fastlane/lanes/translation_validator_test.rb — 9
ruby fastlane/lanes/translation_glossary_test.rb — 5
ruby fastlane/lanes/anthropic_batch_test.rb — 6
ruby fastlane/lanes/ai_translator_test.rb — 30
rubocop clean on all eight files
Live translation pass (needs ANTHROPIC_API_KEY + bundle install): ruby fastlane/lanes/ai_translator.rb fr "You have %1$d new posts" "Notification. %1$d is the count."

Reusable, unit-tested Ruby primitives for the AI translation tier of the localization pipeline: the service behind the `human ?? AI ?? English` floor whose AI stub was left open in #25688. Pure prompt-building and validation with the Anthropic SDK call injected, so the logic is testable without the gem or the network. Not wired into any lane yet.

- TranslationValidator: format-specifier safety gate. A translation must preserve the source's placeholders (count and type; positional reordering allowed), or it is rejected and falls back to English.
- Glossary: brand do-not-translate list plus per-locale terms and register.
- AITranslator: single-string, per-key plural form-set (one consistent stem across CLDR forms), and batched string translation, with structured-output (output_config) enforcement.
- AnthropicBatch: Message Batches submit/await/results/collect for bulk backfill.

50 unit tests, rubocop clean.

The pure-Ruby unit suites (TranslationValidator, Glossary, AnthropicBatch, AITranslator) weren't executed by any pipeline step — the "Unit Tests" jobs are the Xcode/XCTest suites, and rubocop (via Danger) only lints them. Add a lightweight Buildkite step that runs each fastlane/lanes/*_test.rb with plain ruby (stdlib minitest — no Xcode, no app build, no bundle). Runs unconditionally rather than behind should-skip-job.sh --job-type validation, which skips on tooling-only changes — i.e. exactly the PRs that touch these files.

The previous note advertised for_plural as a one-line swap to wire the live translation tier. That path routes each plural form through single-string translate, so it forfeits the cross-form consistency translate_plural exists to provide — the lemma drift PLURAL_OUTPUT warns about. Relabel for_plural as the per-cell fallback and point the live-tier wiring at translate_plural's form-set seam.

…f a translated value (#25721)

* Localization: assert clean() preserves quotes that are part of a value

A translation whose value is itself wrapped in quotation marks must keep them; only the model's cosmetic wrapping around a raw single-string reply should be stripped. Cover both structured paths (translate_plural, translate_all).

* Localization: stop clean() stripping quotes that are part of a value

clean() removes the cosmetic quotes a model wraps around a raw single-string reply. The plural and batch paths ran it on values already decoded by JSON.parse, so a value whose own content is quoted (e.g. "Reader") lost its quotes too.

Run clean() only on the raw single-string reply in translate(); the JSON-decoded plural/batch values are whitespace-trimmed but never quote-stripped, since JSON.parse has already removed the structural quotes and anything left is content. Also covers the async collect_batch path (shares validated_batch) and curly-quoted values for free.

Satisfies the tests added in d923c25.

* Localization: cover curly quotes and the batch path in the quote-preservation tests

Two more regression guards for the clean()-on-decoded-value fix: a curly/smart-quoted value (“Reader”) through translate_all, since clean() strips “ ” as well as straight quotes; and a quoted value through the async collect_batch path, which shares validated_batch with translate_all. Both fail against the pre-fix code, so a narrower fix (only un-stripping straight quotes, or only the sync path) cannot slip past.

---------

Co-authored-by: Jeremy Massel <1123407+jkmassel@users.noreply.github.com>
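The split described in this commit message can be illustrated with a short sketch. This clean() is a hypothetical stand-in written for the example, not the PR's implementation; it only handles the straight and curly double quotes the message mentions.

```ruby
require 'json'

# Quote pairs treated as cosmetic wrapping in this sketch (an assumption).
WRAPPING_QUOTES = [['"', '"'], ['“', '”']].freeze

# Strip one layer of quotes that wrap an entire raw single-string reply.
def clean(raw)
  text = raw.strip
  WRAPPING_QUOTES.each do |open, close|
    next unless text.length >= 2 && text.start_with?(open) && text.end_with?(close)

    return text[1..-2].strip
  end
  text
end

# Raw single-string reply: the model may wrap it in cosmetic quotes.
single = clean('"Suivre"')

# Structured reply: JSON.parse has already removed the structural quotes,
# so anything left is content and is only whitespace-trimmed.
decoded = JSON.parse('{"title": " “Reader” "}')
batch_value = decoded['title'].strip

p single
p batch_value
```

Running clean() on batch_value would strip the curly quotes that are part of the value, which is the regression the added tests guard against.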

oguzkocer

Looks good

I've left one nitpick comment which can possibly apply to a couple other function names, but it's a judgement call, so I'll leave the decision to you.

oguzkocer · 2026-06-30T21:48:56Z

+  end
+
+  # Map each numbered item to its validated translation by key; drop empty/placeholder-breaking ones.
+  def validated_batch(parsed, numbered)


At call sites, I read this function name as "The contents in this batch are already validated" which confused me. However, the actual implementation is "Take this batch and return the validated values" which means it's going to filter them.

Specifically, the 'validated' verb form misled me and for a minute I thought we weren't calling TranslationValidator.placeholders_match? for batches.

This could totally be a me problem, but I think an active version of the verb, or a more verbose version of the function name overall could improve readability at call sites.

Renamed to select_valid_batch, which is Ruby-idiomatic. Also renamed validated_forms to select_valid_forms to match, in f66c215.

At call sites the validated_ prefix reads as an adjective — "the batch that's already been validated" — when both methods are in fact where batch and plural-set translations run the placeholder gate, returning only the passing subset. select_valid_batch / select_valid_forms make the filtering action plain where they're called. Pure rename of two private helpers; no behavior change.

jkmassel added 2 commits June 26, 2026 11:35

jkmassel force-pushed the jkmassel/claude-string-translation branch from 28b37b5 to 7412850 Compare June 26, 2026 17:37

This was referenced Jun 26, 2026

Localization: wire AI plural translation into the GlotPress reverse fold #25710

Draft

Localization: stage regular-string translations into Localizable.xcstrings (manual) #25713

Draft

jkmassel marked this pull request as ready for review June 29, 2026 17:34

jkmassel requested a review from a team as a code owner June 29, 2026 17:34

jkmassel added the Tooling (Build, Release, and Validation Tools) and [Type] Enhancement labels Jun 29, 2026

jkmassel added this to the 27.1 milestone Jun 29, 2026

jkmassel requested a review from oguzkocer June 29, 2026 17:38

oguzkocer mentioned this pull request Jun 30, 2026

Localization: stop the AI translator stripping quotes that are part of a translated value #25721

Merged

3 tasks

oguzkocer approved these changes Jun 30, 2026


jkmassel enabled auto-merge June 30, 2026 22:36

jkmassel added this pull request to the merge queue Jun 30, 2026

Merged via the queue into trunk with commit c27d2b2 Jun 30, 2026
26 checks passed

jkmassel deleted the jkmassel/claude-string-translation branch June 30, 2026 23:07

jkmassel merged 5 commits into trunk from jkmassel/claude-string-translation


4 participants
